
Biological Imaging

Cambridge University Press (CUP)

Preprints posted in the last 30 days, ranked by how well they match Biological Imaging's content profile, based on 15 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit.

1
EpiCure (Epithelial Curation): a versatile and handy tool for curation of epithelial segmentation

Letort, G.; Valon, L.; Michaut, A.; Cumming, T.; Xenard, L.; Phan, M.-S.; Dray, N.; Rueden, C. T.; Schweisguth, F.; Gros, J.; Bally-Cuif, L.; Tinevez, J.-Y.; Levayer, R.

2026-03-27 developmental biology 10.64898/2026.03.27.714683 medRxiv
Top 0.1%
1.6%

Investigating single-cell dynamics and morphology in tissues and embryos requires highly accurate quantitative analysis of microscopy images. Despite significant advances in the field of bioimage analysis, even the most sophisticated segmentation and tracking algorithms inevitably produce errors (e.g., over-segmentation, missing objects, misconnected objects). Although the error rate may be small, the propagation of errors throughout a time-lapse sequence has catastrophic effects on the accuracy of tracking and of extracted single-cell parameters. Extracting single-cell temporal information in the context of a tissue or embryo thus requires expert curation to identify and correct segmentation errors. In the movies commonly used in developmental biology and stem cell research, both the number of imaged cells and the duration of recording are large, making this manual correction task extremely time-consuming. This has now become a major bottleneck in the fields of development, stem cell biology and bioimage analysis. We present here EpiCure (Epithelial Curation), a versatile tool designed to streamline and accelerate manual curation of segmentation and tracking in 2D movies of large epithelial tissues. EpiCure uses temporal information and morphometric parameters to automatically identify segmentation and tracking errors and provides user-friendly tools to correct them. It focuses on ergonomics and offers several visualization options to help navigate movies of tissues containing a large number of cells, speeding up the detection of errors and their curation. EpiCure is highly interoperable and supports input from a wide range of segmentation tools. It also includes multiple export filters, enabling seamless integration with downstream analysis pipelines.
In this paper, using movies from several animal models, we highlight the importance of curating cell segmentation and tracking for accurate downstream analysis, and demonstrate how EpiCure supports the curation process for extracting accurate single-cell dynamics and detecting cellular events, making it faster and applicable to large datasets.

2
Quantitative comparison of fluorescent reporters by FCS excitation scan

Schneider, F.; Trinh, L. A.; Fraser, S. E.

2026-04-05 biophysics 10.64898/2026.04.04.716477 medRxiv
Top 0.1%
1.2%

Fluorescent reporters such as fluorescent proteins or chemigenetic indicators are indispensable tools for studying biological processes using light microscopy. Choosing an appropriate fluorescent tag is a crucial step in experimental design not only for imaging but also for quantitative measurements such as fluorescence fluctuation spectroscopy. Two key parameters should be considered: fluorescence brightness and photobleaching. Changes in fluorescence intensity due to photobleaching are relatively easy to assess in different biological environments, while brightness is more elusive. Here, we develop and employ a fluorescence correlation spectroscopy (FCS) based excitation scan assay that determines fluorescent protein performance and validate it in tissue culture and zebrafish embryos. We employ our FCS pipeline to compare a set of 10 established fluorescent proteins as well as HALO and SNAP tags for both cellular imaging and measurements of diffusion dynamics with FCS. We show that mNeonGreen outperforms mEGFP in tissue culture and zebrafish embryos. We also compare StayGold variants against other green fluorescent proteins and chemigenetic reporters in tissue culture. Overall, we present a broadly applicable approach for determining fluorescent reporter brightness in the living system of interest.

3
Quantitative assessment of collagen architecture from routine histopathological images shows concordance with Second Harmonic Generation microscopy

Ingawale, V.; Dandapat, K.; Konkada Manattayil, J.; Gupta, S.; Shashidhara, L. S.; Koppiker, C.; Shah, N.; Raghunathan, V.; Kulkarni, M.

2026-04-06 pathology 10.64898/2026.03.31.26349841 medRxiv
Top 0.2%
0.9%

Collagen organisation within the tumour microenvironment plays a critical role in tumour progression and has emerged as an important structural biomarker in cancer. Second Harmonic Generation (SHG) microscopy enables label-free visualisation and quantitative assessment of fibrillar collagen architecture; however, its high cost, specialised instrumentation, and limited field-of-view restrict routine clinical application. In this study, we evaluated whether collagen features quantified from digitally scanned Masson-Goldner's Trichrome-stained histopathological sections can approximate measurements obtained from SHG microscopy. Formalin-fixed paraffin-embedded breast tumour tissues, including benign and invasive ductal carcinoma (IDC) samples with varying collagen content, were analysed using SHG microscopy and whole-slide brightfield imaging. Matched regions of interest were analysed using two independent digital image analysis approaches: a conventional ImageJ-based workflow (TWOMBLI) and a machine learning-based computational pipeline. Collagen structural parameters including collagen deposition area, fibre number, and alignment metrics were quantified and compared across imaging modalities using correlation analysis. SHG signals were consistently detected from trichrome-stained sections, confirming compatibility of SHG imaging. Quantitative comparison demonstrated significant concordance between SHG-derived collagen metrics and those obtained from digital image analysis pipelines, particularly for collagen area and fibre alignment. These findings demonstrate that computational analysis of routine histopathological images can capture key spatial features of collagen organisation comparable to SHG microscopy. Digital pathology-based collagen quantification therefore represents a scalable and clinically accessible approach for assessing extracellular matrix architecture in tumour tissues.

4
Variable Resolution Maps (VRM) in CCTBX and Phenix: Accounting For Local Resolution In cryoEM

Afonine, P.; Adams, P. D.; Urzhumtsev, A. G.

2026-03-28 bioinformatics 10.64898/2026.03.25.714315 medRxiv
Top 0.2%
0.8%

Calculation of density maps from atomic models is essential for structural studies using crystallography and electron cryo-microscopy (cryoEM). These maps serve various purposes, including atomic model building, refinement, visualization, and validation. However, accurately comparing model-calculated maps to experimental data poses challenges, particularly because the resolution of cryoEM experimental maps varies across the map. Traditional crystallography methods generate finite-resolution maps with uniform resolution throughout the unit cell volume, while most modern cryoEM software employs Gaussian-like functions to generate these maps, which do not adequately account for atomic model parameters and resolution. Recent work by Urzhumtsev & Lunin (2022, IUCr Journal, 9, 728-734) introduces a novel method for computing atomic model maps that incorporate local resolution and can be expressed as analytically differentiable functions of all atomic parameters. This approach enhances the accuracy of matching atomic models to experimental maps. In this paper, we detail the implementation of this method in CCTBX and Phenix.
Synopsis: New tools implemented in CCTBX and Phenix allow the calculation of variable-resolution maps through a sum of atomic images expressed as analytic functions of all atomic parameters, along with their associated local resolution.

5
BrightEyes-FFS: an open-source platform for comprehensive analysis of fluorescence fluctuation spectroscopy experiments with small detector arrays

Slenders, E.; Perego, E.; Zappone, S.; Vicidomini, G.

2026-04-10 bioinformatics 10.64898/2026.04.08.717207 medRxiv
Top 0.2%
0.7%

Fluorescence fluctuation spectroscopy (FFS) is an ensemble of techniques for quantitative measurement of molecular dynamics and interactions. Recently, the introduction of small-format array detectors has opened up a new range of spatiotemporal information, allowing for more detailed analysis of system kinetics. However, there is currently no open-source software available for analyzing the high-dimensional FFS data sets. We present BrightEyes-FFS, an open-source Python-based environment for FFS analysis with array detectors. The environment includes a Python package for reading raw FFS data, computing auto- and cross-correlations using various algorithms, and fitting the correlations to several models. A graphical user interface (GUI), available as a standalone executable, makes the analysis fast and user-friendly. An automated Jupyter Notebook writing tool enables transition from the GUI to Jupyter Notebook for custom analysis. We believe that BrightEyes-FFS will enable a wider community to study diffusion, flow, and interaction dynamics.

6
CosMxScope: Scalable Reconstruction and Digital Pathology Integration of Imaging-Based Spatial Transcriptomics Data

Chen, J.; Isett, B.; Gu, Q.; Bao, R.

2026-03-30 bioinformatics 10.64898/2026.03.25.713520 medRxiv
Top 0.3%
0.7%

Spatial transcriptomics technologies have transformed the capacity to quantify gene expression from human tissues, simultaneously capturing both the cell functional state and spatial organization at cellular and subcellular resolution. The CosMx Spatial Molecular Imager (SMI) is one of the leading platforms for single-cell spatial multi-modal omics profiling, capable of measuring thousands of RNA or protein targets per cell across whole slide sections. The exported data include field-of-view (FOV) image tiles, subcellular transcript coordinates, and cell segmentation polygons, outputs that are rich in spatial information but not directly compatible with widely used digital pathology tools. Here we present CosMxScope, a lightweight open-source Python framework that bridges CosMx spatial outputs with histopathology visualization environments. CosMxScope provides functions for stitching FOV image tiles into reconstructed whole-slide images, converting cell segmentation polygons and transcript coordinates into GeoJSON objects that enable further assessment within QuPath, and generating spatial visualization plots of cell types, transcript locations, and gene expression patterns. The framework is designed for practical use in translational research settings, enabling interactive exploration of spatial transcriptomic data alongside cell morphology. CosMxScope has been applied in multiple ongoing research projects involving CosMx profiling of human and mouse tissues, supporting pathology-based spatial analysis workflows. This open-source software is available at https://github.com/AivaraX-AI/CosMxScope.

7
Correlate: A Web Application for Analyzing Gene Sets and Exploring Gene Dependencies Using CRISPR Screen Data

Deolankar, S.; Wermeling, F.

2026-04-04 bioinformatics 10.64898/2026.04.02.716070 medRxiv
Top 0.3%
0.7%

CRISPR screen data provides a valuable resource for understanding gene function and identifying potential drug targets. Here, we present Correlate, a freely accessible web application (https://correlate.cmm.se) that enables exploration of the Cancer Dependency Map (DepMap) CRISPR screen gene effects, hotspot mutations, and translocation/fusion data across more than 1,000 human cancer cell lines. The application supports two main use cases: (i) analysis of user-defined gene sets (e.g. CRISPR screen hits) to identify functionally linked genes based on correlations while providing an overview based on essentiality or user-provided screen statistics; and (ii) exploration of genes of interest in defined biological contexts, such as specific cancer types or mutational backgrounds, to generate hypotheses about gene function and dependencies. Additionally, Correlate supports experimental design by providing rapid overviews of gene essentiality and enabling the identification of cell lines with relevant mutational profiles. In contrast to knowledge-based approaches such as STRING and GSEA, which rely on prior biological annotations and curated interaction networks, Correlate identifies gene connections directly from functional CRISPR screen readouts, offering a complementary and data-driven perspective on gene network analysis. The application runs entirely in the browser, requires no installation or login, and integrates with the Green Listed v2.0 tool family for custom CRISPR screen design.
Highlights:
■ Interactive web-based platform for bulk correlation analysis of user-defined gene sets using DepMap CRISPR screen data, requiring no installation or programming expertise.
■ Identifies functional gene relationships from CRISPR screen readouts rather than curated annotations, offering a data-driven complement to tools such as GSEA and STRING.
■ Enables contextual exploration of gene dependencies across cancer types and mutational backgrounds, supporting hypothesis generation about gene function and therapeutic targets.
■ Supports experimental design through gene essentiality overviews, mutation and fusion analysis, and cell line identification, with optional integration of user-provided statistics from CRISPR screens, proteomics, or transcriptomics analyses.

8
Artificial Intelligence Devices for Image Analysis in Digital Pathology

Matthews, G. A.; Godson, L.; McGenity, C.; Bansal, D.; Treanor, D.

2026-03-26 pathology 10.64898/2026.03.23.26349089 medRxiv
Top 0.3%
0.5%

Background: There is increasing momentum behind the clinical implementation of AI-based software for image analysis in digital pathology. As regulations, standards, and national approaches to the clinical use of AI continue to develop, the marketplace of AI products is expanding and evolving - presenting pathologists with a multitude of devices that offer the potential to improve pathology services.
Methods: To maintain pace with this changing AI device landscape, we conducted a comprehensive search for, and analysis of, commercial AI products for image analysis in digital pathology. This included CE-marked and Research Use Only (RUO) products using images with histological stains (e.g., H&E) or immunohistochemical (IHC) labelling. Product information and published clinical validation studies were assessed, to understand the quality of supporting evidence on available products, and product details were compiled into a public register: https://osf.io/gb84r/overview.
Results: In total, we identified and assessed 90 CE-marked and 227 RUO AI products. We found that AI products for cancer detection in prostate and breast pathology comprised a substantial portion of the marketplace for H&E image analysis, while IHC products were almost exclusively for use in breast cancer. Clinical validation studies on these products have steadily increased; however, we found that published studies were only available for just over half of H&E products and just over a quarter of IHC products. For CE-marked products, the dataset quality and diversity for AI model performance validation was highly variable, and particularly limited for IHC products. Furthermore, only a limited number of products included studies that assessed measures of clinical utility.
Conclusion: As clinical deployment of AI products for image analysis in histopathology grows, there is a need for transparency, rigorous validation, and clear evidence supporting clinical utility and cost-effectiveness. Independent scrutiny of the expanding offering of AI products provides insight into the opportunities and shortcomings in this domain.

9
Imaging Mass Cytometry (IMC) as a Tool to Characterize Circulating Tumor Cells (CTCs) in Preclinical Mouse Models

Pore, M.; Balamurugan, K.; Atkinson, A.; Breen, D.; Mallory, P.; Cardamone, A.; McKennett, L.; Newkirk, C.; Sharan, S.; Bocik, W.; Sterneck, E.

2026-04-16 cancer biology 10.64898/2025.12.18.695262 medRxiv
Top 0.3%
0.5%

Circulating tumor cells (CTCs), and especially CTC-clusters, are linked to poor prognosis and may reveal mechanisms of metastasis and treatment resistance. Therefore, developing unbiased methods for the functional characterization of CTCs in liquid biopsies is an urgent need. Here, we present an evaluation of multiplex imaging mass cytometry (IMC) to analyze CTCs in mice with human xenograft tumors. In a single-step process, IMC uses metal-labeled antibodies to simultaneously detect a large number of proteins/modifications within minimally manipulated small volumes of blood from the tail vein or heart. We used breast cancer cell lines and a patient-derived xenograft (PDX) to assess antibodies for cross-species interpretation. Along with manual verification, HALO-AI-based cell segmentation was used to identify CTCs and quantify markers. Despite some limitations regarding human-specificity, this technology can be used to investigate the effect of genetic and pharmacological interventions on the properties of single and cluster CTCs in tumor-bearing mice.

10
Cost-function Optimized Maximal Overlap Drift Estimation for Single Molecule Localization Microscopy

Reinkensmeier, L.; Aufmkolk, S.; Farabella, I.; Egner, A.; Bates, M.

2026-03-31 biophysics 10.64898/2026.03.27.714864 medRxiv
Top 0.4%
0.5%

Single-molecule localization microscopy (SMLM) methods enable fluorescence imaging of biological specimens with nanometer-scale resolution. Although fluorophore localization precision is theoretically limited only by photon statistics, in practice the resolution of SMLM images is often degraded by physical drift of the sample and/or the microscope during data acquisition. At present, correcting this effect requires either specialized stabilization systems or computationally intensive post-processing, and established drift correction algorithms based on image cross-correlation suffer from limited temporal resolution. In this study we introduce COMET, a new method for SMLM drift estimation which achieves a substantially higher precision, accuracy, and temporal resolution compared with existing algorithmic approaches. We demonstrate that improved drift estimation translates directly into higher SMLM image resolution, limited by localization precision rather than drift artifacts. COMET is applicable to all types of SMLM data, operating directly on 2D or 3D localization datasets, and is readily integrated into analysis workflows. We benchmark its performance using both simulations and experiments, including STORM, MINFLUX, and Sequential OligoSTORM measurements, where long acquisition times make drift correction particularly challenging. COMET is published as an open-source, Python-based software project and is also available on open cloud-computing platforms.

11
Object Detection Techniques for Live Monitoring of Amoeba in Phase-Contrast Microscopic Images

Chambers, O.; Cadby, A. J.

2026-04-01 biophysics 10.64898/2026.03.30.715415 medRxiv
Top 0.4%
0.4%

In contemporary bio-imaging-based research, computer-based assessment is becoming crucial for the characterisation of biological structures, as it minimises the need for time-consuming human annotation, which is prone to human error. Furthermore, it allows for the use of optical techniques that use lower photon intensities, thereby reducing reliance on high-intensity excitation and mitigating adverse effects on their activities. This study details the development and evaluation of sophisticated deep-learning models for amoeba detection using phase-contrast imaging. Using a single-class annotated dataset comprising 88 images and 4,131 annotations, we developed nine object detection models based on Detectron 2 and six variants based on YOLO v10. The diversity of the dataset, acquired under varying setup parameters, facilitated a comprehensive evaluation of the strengths and limitations of each model. A comparative analysis of speed and accuracy was performed to identify the most efficient models for real-time detection, providing critical insights for future microscopic analyses.

12
Algorithm-Based Model for Gastrointestinal and Liver Histopathological Analysis Using VGG16 and Specialized Stains: Statistical Validation of Thresholds in AI-Driven Digital Pathology

Adeluwoye, A. O.; Gbadegesin, M. O.; James, F. M.; Otegbade, P. S.; Alabetutu, A.

2026-04-11 pathology 10.64898/2026.04.08.26350456 medRxiv
Top 0.4%
0.4%

Digital pathology, coupled with advanced image recognition algorithms, represents a transformative frontier in histopathological diagnosis. This sub-Saharan African laboratory's exploratory study investigates the application of a Convolutional Neural Network (CNN) model, specifically leveraging the VGG16 architecture with transfer learning, for automated analysis and classification of selected gastrointestinal (GIT) and liver tissue samples, incorporating both routine and specialized staining protocols. The study utilized a dataset comprising 114 samples (18 liver, 96 GIT images) derived from archival formalin-fixed paraffin-embedded tissue blocks at University College Hospital, Ibadan, Nigeria. Specialized staining techniques included Alcian Yellow for GIT mucin visualization and Masson's Trichrome for liver fibrosis assessment, alongside conventional H&E staining. Model performance was evaluated using statistical methodologies including Wilson Score confidence intervals (CI), Bayesian probability assessment, and effect size analysis. Results reveal a striking dichotomy in model performance. The GIT tissue model achieved perfect classification accuracy (100% test accuracy) with exceptional statistical significance (Z=10.0, p<0.0001), Wilson CI [96.29%, 99.99%], Cohen's h=1.571, and Bayesian probability >99.99%. Conversely, the liver tissue model demonstrated diagnostic failure (42.86% test accuracy), with Z=-1.428, p=0.9236, Wilson CI [33.59%, 52.65%], Cohen's h=-0.144, and Bayesian probability of 7.64%. This performance divergence correlates with training data availability, as the liver dataset fell far below empirically established thresholds (>100-200 samples) for reliable classification. The liver model's failure reveals limitations in transfer learning with insufficient data.
These findings underscore critical implications for AI-enhanced digital pathology, indicating that the GIT model is a promising candidate for deployment and supporting tissue-specific model development.
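
The abstract above reports Wilson score confidence intervals and Cohen's h without giving the formulas. As a minimal sketch of the standard textbook definitions (not the authors' code; the function names `wilson_interval` and `cohens_h` and the constant `Z_95` are illustrative assumptions), the two statistics can be computed as follows:

```python
import math

# Two-sided 95% standard normal quantile (assumed confidence level).
Z_95 = 1.959963984540054


def wilson_interval(successes, n, z=Z_95):
    """Wilson score interval for a binomial proportion successes / n."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1.0 + z * z / n
    # Centre of the interval is pulled towards 0.5 relative to p_hat.
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    )
    return centre - half_width, centre + half_width


def cohens_h(p1, p2):
    """Cohen's h effect size: difference of arcsine-transformed proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))
```

For p1 = 1.0 and p2 = 0.5, `cohens_h` evaluates to π/2 ≈ 1.571, which is consistent with the h = 1.571 quoted for the GIT model if the comparison baseline was 50% (an assumption; the abstract does not state the baseline).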

13
fishROI: A specialized workflow for semi-automated muscle morphometry analysis in teleosts

Lu, Y.; Pan, M.; Jamwal, V.; Locop, J.; Ruparelia, A. A.; Currie, P. D.

2026-03-30 cell biology 10.64898/2026.03.27.714781 medRxiv
Top 0.4%
0.4%

Quantitative histological analysis of skeletal muscle morphometry provides critical insights into muscle physiology but remains labor-intensive and technically demanding. While recent developments in machine-learning-based image segmentation techniques have facilitated large-scale tissue analysis, existing tools that automate muscle morphometry analysis are largely tailored to mammalian models, with limited applicability to teleosts. Moreover, there is a lack of effective tools for visualizing spatial organization and morphometric variability of teleost muscle fibers, a feature that is important for understanding hyperplastic muscle growth dynamics in teleosts. In this study, we show that cytoplasmic staining combined with deep learning-based cell segmentation offers a robust and accurate approach for automated muscle morphometry analysis in developing zebrafish. We also introduce a FIJI2 plugin, implemented in Jython, that streamlines both morphometric analysis and visualization. This tool accommodates shallow and deep learning-based segmentation techniques and incorporates novel quantification and visualization methods suited to teleost-specific muscle features, including mosaic hyperplasia dynamics. The plugin features an intuitive graphical user interface and is designed for flexibility, with minimal constraints regarding species, image quality, or staining protocol. Its modular architecture allows it to be used as a baseline for automated muscle morphometry analysis, while permitting integration with other tools and workflows.

14
sRQA: An Integrative Pipeline for Symbolic Recurrence Quantification Analysis

Curtin, A.; Merriman, E.; Curtin, P.

2026-04-02 systems biology 10.64898/2026.03.31.715624 medRxiv
Top 0.5%
0.3%

Recurrence Quantification Analysis (RQA) is a powerful phenomenological method for characterizing dynamical systems from sequential empirical data, but it is fundamentally limited to continuous signals. Symbolic RQA (sRQA) extends this framework to discrete state sequences, enabling the analysis of both inherently discrete systems and continuous systems where state-based dynamics and motifs are of interest. Despite its promise, accessible and unified software support for sRQA has remained limited. Here we introduce the sRQA package, an open-source R library that consolidates discretization and symbolization, data visualization, and computation of recurrence and cross-recurrence metrics into a single accessible toolset. We validated the method using simulated data with known dynamical properties, confirming that sRQA metrics behaved as theoretically expected. We then demonstrated the utility of sRQA across three real-world applications. First, we applied sRQA to ECG recordings, showing that symbolic recurrence metrics reliably distinguished atrial fibrillation from normal sinus rhythm, with an XGBoost classifier achieving 92% accuracy and an AUC of 0.97. Second, we applied sRQA to fMRI BOLD time series from the dorsal attention network, finding that symbolic and cross-recurrence metrics differentiated movie-viewing from resting-state conditions, revealing greater regularity and inter-subnetwork coordination during task engagement. Third, we applied sRQA to intrinsically symbolized sequences of pauses in speech, identifying valence-specific differences in pause dynamics between truthful and deceptive statements, as well as sex differences in pause structure during negatively-valenced speech. Together, these results demonstrate that sRQA provides a flexible and sensitive framework for characterizing discrete and discretized dynamical systems across biological and behavioral domains. 
Author summary: Many biological and behavioral systems are best understood as sequences of discrete states rather than smooth, continuous processes: for example, a heartbeat that shifts between rhythms, a brain that transitions between activity patterns, or a speaker who pauses and resumes in ways that carry meaning. Standard methods for analyzing the dynamics of such systems were not designed with this kind of data in mind. Here, we introduce the sRQA package, an open-source software library that makes it straightforward to apply symbolic recurrence analysis to both discrete and continuous data. We demonstrate the library across four examples: simulated data with known properties, cardiac recordings distinguishing atrial fibrillation from normal heart rhythm, brain imaging data capturing differences between rest and task engagement, and speech recordings where pause patterns differ between truthful and deceptive statements. In each case, sRQA revealed meaningful structure in the data that would be difficult to detect with conventional tools. We hope this library will make symbolic recurrence analysis more accessible to researchers across the biological and behavioral sciences.

15
GlycoDiveR: a modular R framework to analyze and visualize highly dimensional glycoproteomics data

Veth, T. S.; Riley, N. M.

2026-03-24 systems biology 10.64898/2026.03.21.713336 medRxiv
Top 0.5%
0.3%

Mass spectrometry-based glycoproteomics is a critical platform for understanding the complex roles of protein glycosylation in biological systems, yet visualizing multidimensional glycoproteomics datasets remains a significant bottleneck in data interpretation and communication. Glycan microheterogeneity, i.e., the potential for a glycosite to be modified by multiple glycans, defies the binary presence-absence logic used in analyses of other post-translational modifications. Instead, glycoproteomics necessitates intentionally designed data structures and visualizations that are glycoform-centric, not just site-centric. Additionally, there is a need for complementary degrees of data analysis that alternate between glycoproteome-scale patterns and glycosite-specific regulation. Several bespoke frameworks for visualizing glycoproteomics data have emerged, but they often require advanced programming expertise and are designed for a single study rather than broad application. Here, we present our efforts to harmonize post-search data analysis of glycoproteomics through a modular R framework called GlycoDiveR. This platform streamlines import, transformation, and curation of qualitative and quantitative glycopeptide identifications, including support for raw output from multiple search engines. GlycoDiveR is designed to integrate seamlessly into existing analysis workflows by enabling fast, flexible exploration of highly dimensional glycoproteomics datasets via a consistently formatted data architecture. Our goal is to offer a customizable set of glycosylation-specific visualizations with minimal coding, while keeping data accessible to users who wish to further customize their characterization strategies. It also maintains a modular design that supports the continual addition of visualizations, analyses, and export functions. 
Ultimately, GlycoDiveR is meant to improve accessibility of glycoproteomic-specific analyses and lower the barrier to exploring biological narratives embedded in rich glycoproteomic datasets. GlycoDiveR is open-source and freely available at https://github.com/riley-research/GlycoDiveR.

16
CCIDeconv: Hierarchical model for deconvolution of subcellular cell-cell interactions in single-cell data

Jayakumar, R.; Panwar, P.; Yang, J. Y. H.; Ghazanfar, S.

2026-03-30 bioinformatics 10.64898/2026.03.26.714643 medRxiv
Top 0.5%
0.3%

Motivation: Cell-cell interaction (CCI) underlies several fundamental mechanisms including development, homeostasis and disease progression. CCI are known to be localised to specific subcellular regions, for example, within the cytoplasms of cells. With the emergence of subcellular spatial transcriptomics technologies (sST), there is an opportunity to attribute CCI to subcellular regions. We aimed to deconvolute CCI to subcellular CCI (sCCI) in non-spatial single-cell transcriptomics (i.e. scRNA-seq) datasets using a modified CCI score from CellChat.
Results: By calculating the sCCI score specific to cytoplasm and nucleus in nine publicly available sST datasets, we identified unique nucleus-nucleus and cytoplasm-cytoplasm sCCI. Then, we deconvolved the communication score to subcellular regions by using a hierarchical classification and regression model, which we name CCIDeconv. We performed leave-one-dataset-out cross-validation across nine datasets over a range of different tissue types from human samples. We observed that training across many different tissue types resulted in robust deconvolution performance in an unseen dataset. As the number of training datasets increased, models trained without spatial features achieved performance similar to that of models including spatial features. This implied the potential for accurate prediction of sCCI events even from scRNA-seq, given large numbers of training datasets. Overall, we offer a method towards attributing CCI events to subcellular regions. This method can help researchers dissect sCCI patterns to gain insights into the underlying biology in a range of tissues covering health and disease.

17
Statistical Principles Define an Open-Source Differential Analysis Workflow for Mass Spectrometry Imaging Experiments with Complex Designs

Rogers, E. B. T.; Lakkimsetty, S. S.; Bemis, K. A.; Schurman, C. A.; Angel, P. A.; Schilling, B.; Vitek, O.

2026-04-10 bioinformatics 10.64898/2026.04.08.717212 medRxiv
Top 0.5%
0.3%

Mass spectrometry imaging (MSI) characterizes the spatial heterogeneity of molecular abundances in biological samples. Experiments with complex designs, involving multiple conditions and multiple samples, provide particularly useful insight into differential abundance of analytes. However, analyses of these experiments require attention to details such as signal processing, selection of regions of interest, and statistical methodology. This manuscript contributes a statistical analysis workflow for detecting differentially abundant analytes in MSI experiments with complex designs. Using a case study of histologic samples of human tibial plateaus from knees of osteoarthritis patients and cadaveric controls, as well as simulated datasets, we illustrate the impact of the analysis decisions. We show the importance of signal processing and feature aggregation for preserving biological relevance and alleviating the stringency of multiple testing. We further demonstrate the importance of selecting regions of interest in ways that are compatible with differential analysis. Finally, we contrast several common statistical models for differential analysis, showcase the appropriate use of replication, and demonstrate model-based calculation of sample size for follow-up investigations. The discussion is accompanied by detailed recommendations and an open-source R-based implementation that can be followed by other investigations.
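To make the multiple-testing point concrete, here is a minimal sketch of the Benjamini-Hochberg adjustment. The abstract does not say which correction the workflow uses, so the choice of Benjamini-Hochberg is an assumption, and the p-values are invented for illustration. The sketch is in Python and is unrelated to the R implementation the authors provide.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four features.
pvalues = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(pvalues)
print(adjusted)
```

Each p-value is scaled by the number of tests m divided by its rank, so when aggregation reduces the number of features tested, the same raw p-values are adjusted less severely.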

18
UQ-PhysiCell: An extensible Python framework for uncertainty quantification and model analysis in PhysiCell

L. Rocha, H.; Bucher, E.; Zhang, S.; Deshpande, A.; Bergman, D. R.; Heiland, R.; Macklin, P. R.

2026-04-08 systems biology 10.64898/2026.04.06.716692 medRxiv
Top 0.6%
0.3%

Agent-based models (ABMs) are widely used to study complex multiscale biological systems, particularly in cancer research. However, their high-dimensional parameter spaces, stochasticity, and computational costs pose significant challenges for uncertainty quantification, calibration, and systematic comparison of competing mechanistic hypotheses. PhysiCell has evolved into a growing ecosystem of open-source tools supporting physics-based multicellular modeling, including model construction, visualization, and data integration. However, despite these advances, systematic support for uncertainty-aware model analysis, scalable parameter exploration, and formal calibration workflows remains limited. Here, we introduce UQ-PhysiCell, an open-source Python package that enables uncertainty quantification, calibration, and model selection for PhysiCell models using a modular and scalable workflow. UQ-PhysiCell acts as a manager of PhysiCell simulation inputs and outputs, including parameters, initial conditions, rules, and MultiCellDS-compliant objects, and provides automated orchestration of large ensembles of simulations. The framework supports multiple levels of parallelism to accelerate the analysis, including the parallel execution of independent simulations, stochastic replicates, and downstream analysis tasks. UQ-PhysiCell integrates seamlessly with established Python libraries for sensitivity analysis, optimization, Bayesian inference, and surrogate modeling, allowing users to construct customized pipelines that match their modeling goals and computational resource requirements. By decoupling model execution from statistical analysis and emphasizing extensibility and reproducibility, UQ-PhysiCell lowers the barrier to applying rigorous uncertainty-aware methodologies to agent-based modeling and supports the systematic evaluation of PhysiCell models in biological and biomedical research. 
Author summary: We developed UQ-PhysiCell to address a key challenge in agent-based modeling: the systematic quantification of uncertainty in complex stochastic simulations. PhysiCell is widely used to model multicellular biological systems, particularly in cancer research; however, practical tools for uncertainty analysis, calibration, and model comparison are often developed in an ad hoc manner. This makes the results difficult to reproduce and limits the ability to rigorously evaluate competing biological hypotheses. UQ-PhysiCell provides a flexible Python framework that manages the inputs and outputs of PhysiCell simulations and enables large-scale computational analysis. We designed the software to be modular, allowing users to build their own analysis pipelines and combine different methodologies for sensitivity analysis, calibration, and model selection. Rather than enforcing a single workflow, UQ-PhysiCell supports customization to match specific scientific questions and computational requirements. To make uncertainty-aware analyses feasible for computationally intensive agent-based models, UQ-PhysiCell implements multiple parallelism strategies, enabling the concurrent execution of simulations, stochastic replicates, and downstream analyses. By promoting reproducibility, scalability, and methodological flexibility, UQ-PhysiCell helps researchers move beyond single best-fit simulations toward more reliable and interpretable computational modeling.
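The idea of running parameter sets and stochastic replicates concurrently can be sketched with the Python standard library alone. This sketch does not use or reproduce the UQ-PhysiCell API: `run_simulation` is a hypothetical toy stand-in for one stochastic model run, and `run_ensemble`, the growth rates and the replicate count are illustrative assumptions.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_simulation(growth_rate: float, seed: int) -> float:
    """Hypothetical stand-in for one stochastic run; returns a final population."""
    rng = random.Random(seed)  # per-run seed keeps each replicate reproducible
    population = 100.0
    for _ in range(50):
        population *= 1.0 + growth_rate + rng.gauss(0.0, 0.01)
    return population


def run_ensemble(growth_rates, n_replicates, max_workers=4):
    """Run every (parameter, replicate) pair concurrently and summarize per parameter."""
    jobs = list(product(growth_rates, range(n_replicates)))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves job order, so outputs line up with jobs.
        outputs = list(pool.map(lambda job: run_simulation(*job), jobs))
    by_rate = {}
    for (rate, _seed), value in zip(jobs, outputs):
        by_rate.setdefault(rate, []).append(value)
    # Mean and standard deviation across the stochastic replicates.
    return {
        rate: (statistics.mean(vals), statistics.stdev(vals))
        for rate, vals in by_rate.items()
    }


ensemble = run_ensemble([0.01, 0.02], n_replicates=5)
for rate, (mean, sd) in ensemble.items():
    print(f"growth_rate={rate}: mean={mean:.1f} sd={sd:.1f}")
```

A thread pool keeps the sketch short and portable. For CPU-bound simulations launched as external processes, a process pool or a cluster scheduler would be the more typical choice.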

19
Cross-Species Morphology Learning Enables Nucleic Acid-Independent Detection of Live Mutant Blood Cells

Khan, S. A.; Faerber, D.; Kirkey, D.; Stirewalt, D.; Raffel, S.; Hadland, B.; Deininger, M.; Buettner, F.; Zhao, H. G.

2026-03-25 pathology 10.1101/2025.10.20.682949 medRxiv
Top 0.6%
0.3%

In both neonates and adults, the presence of malignancy-associated mutations in peripheral blood (PB) correlates with an elevated risk of future neoplastic transformation, with certain mutations, such as KMT2A rearrangements, exhibiting near-complete penetrance. If feasible, pre-malignant screening could enable early intervention and even disease prevention. However, nucleic acid sequencing- and hybridization-based mutation detection have limited cost-efficiency, constraining their use in screening. Here, we introduce a computer vision platform that can identify mutant cells in fresh PB samples that carry KMT2A-MLLT3 (a frequent mutation in pediatric and adult leukemias and detectable in newborn blood samples) or JAK2-V617F (a frequent mutation in myeloproliferative neoplasms and clonal hematopoiesis). This is achieved by high-throughput single-cell imaging and mutation detection by machine learning (ML)-powered morphology recognition. The ML models were developed by cross-species learning of conserved features between mutant cells from mouse genetic models and from human samples, enabling a cost-effective approach for detecting mutations in live blood cells. This platform holds promise for pre-malignant screening in asymptomatic neonates and adults with KMT2A-MLLT3 or JAK2-V617F mutations and is potentially generalizable to the detection of other malignancy-associated mutations. Our platform provides a novel single-cell morphological data modality that complements existing single-cell genomics.

20
Representation Methods of Transcriptomics with Applications in Neuroimmune Biology

Abbasi, M.; Ochoa Zermeno, S.; Spendlove, M. D.; Tashi, Z.; Plaisier, C. L.; Bartelle, B. B.

2026-04-07 bioinformatics 10.64898/2026.04.03.716238 medRxiv
Top 0.6%
0.3%

Interpretable representations of gene expression are used to define cellular identities and the molecular programs active within cells, two related but distinct phenomena. In the case of microglia, a cell type with high transcriptomic, functional, and morphological heterogeneity, the predominant representation of transcriptomic data presumes the adoption of distinct molecular identities, despite a lack of easily separable transcriptional states. Here, we explore alternative transcriptomic representations by comparing two single-cell analysis methods: differential expression analysis for identities and co-expression network analysis for molecular programs. For microglia, co-expression network analysis identifies highly significant functional ontologies not resolved by differential expression analysis. The identified co-expression modules are preserved across transcriptomic datasets and suggest reducible functional programs that activate and modulate depending on context. We conclude that co-expression analysis constitutes a best practice for single-cell analysis of an individual cell type, and that describing microglial function as concurrent molecular programs offers a more parsimonious model.